Counting documents that contain substrings more than k times.
Similar resources
Counting common substrings effectively
This article presents an effective (dynamic) algorithm for counting the substrings of a given string that are also substrings of a second string. The algorithm can be used, for example, to quickly compute a string similarity measure with the generalized n-gram method (Niewiadomski measure [2]), as the article shows. Correctness and complexity analyses are included. 1 Oz...
Clustering Documents with Maximal Substrings
This paper provides experimental results showing that maximal substrings can serve as the elementary building blocks of documents in place of the words extracted by a current state-of-the-art supervised word extraction method. A maximal substring is a substring whose number of occurrences decreases when even a single character is appended to its head or tail. The main feature of maximal s...
Documents Mean More than Just Paper!
With the advent of electronic documents, information is available in more than just its visual form: electronic information is display-independent. Though the principal mode of display is still visual, we can now produce alternative renderings of this information; we have designed a computing system, ASTER, that produces an audio view. The visual mode of communication is characterized by the sp...
Conversation around documents: more than threading
INTRODUCTION Besides being a great repository of data, the Web is a space for discussion. More and more systems that aim to support distributed group activities are being developed, e.g., Yahoo Groups and Google Groups. Most of these systems offer an option that allows their members to post messages and thereby discuss different topics. These discussions appear mainly in the form of text, t...
Times ” More Difficult to Approximate Than
How many points do we need to approximate a given metric space S (e.g., a ball in the Euclidean space) with a given accuracy ε > 0? To be more precise, how many points do we need to reproduce the metric ρ(X, Y) on S with an accuracy ε? This problem is known to be equivalent to the following geombinatoric problem: find the smallest number of balls of given radius ε that cover a given set S. A s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal
Journal title: Journal of Natural Language Processing
Year: 2002
ISSN: 1340-7619,2185-8314
DOI: 10.5715/jnlp.9.5_43